IMORC: An infrastructure and architecture template for implementing high-performance reconfigurable FPGA accelerators
Abstract
The design, implementation and optimization of FPGA accelerators is a challenging task, especially when the accelerator comprises multiple compute cores distributed across CPU and FPGA resources and memories and exhibits data-dependent runtime behavior. To simplify the development of FPGA accelerators, we propose IMORC, an infrastructure and architecture template that helps raise the level of abstraction. The IMORC development flow is based on a modeling technique for visualizing an application's communication demand and on an architecture template that aids the developer in implementing the design. The architecture template consists of a versatile on-chip interconnect with asynchronous FIFOs and bitwidth conversion placed in the communication links, a performance monitoring infrastructure for collecting performance information at runtime, and a set of generic infrastructure cores that are frequently needed in accelerator designs. We demonstrate the usefulness of the IMORC development flow with a case study that accelerates the k-th nearest neighbor thinning problem, where IMORC greatly helps us understand the communication demand and implement the application. With the integrated performance monitoring infrastructure, we gain insights into the data-dependent behavior of the accelerator that help us identify bottlenecks and optimize the accelerator, achieving a speedup of 10× to 40× over an optimized CPU implementation.
Similar resources
IMORC: An infrastructure for performance monitoring and optimization of reconfigurable computers
For many years academic research has studied the use of application-specific coprocessors based on field-programmable gate arrays (FPGAs) to accelerate high-performance computing (HPC) applications. Since major supercomputer vendors now provide servers with integrated reconfigurable accelerators, this technology is available to a much broader group of users. Still, designing an accelerator and ...
FPGA Acceleration of Communication-Bound Streaming Applications: Architecture Modeling and a 3D Image Compositing Case Study
Reconfigurable computers usually provide a limited number of different memory resources, such as host memory, external memory, and on-chip memory with different capacities and communication characteristics. A key challenge for achieving high performance with reconfigurable accelerators is the efficient utilization of the available memory resources. A detailed knowledge of the memories’ parameter...
An analytical FPGA performance model for reconfigurable computing
Optimizing FPGA architectures is one of the key challenges in the digital design flow. Traditionally, FPGA designers use CAD tools to evaluate architectures in terms of area, delay and power. Recently, analytical methods have been proposed to optimize architectures faster and more easily. A complete analytical power, area and delay model has received little attention to date. In addi...
A Scalable and Reconfigurable Shared-Memory Graphics Cluster Architecture
If the computational demands of an interactive graphics rendering application cannot be met by a single commodity Graphics Processing Unit (GPU), multiple graphics accelerators may be utilised on multi-GPU based systems such as SLI [1] or Crossfire [2] or by a cluster of PCs in conjunction with a software infrastructure. Typically these PC cluster solutions allow the application programmer to u...
A Soft Processor Overlay with Tightly-coupled FPGA Accelerator
FPGA overlays are commonly implemented as coarse-grained reconfigurable architectures with a goal to improve designers’ productivity through balancing flexibility and ease of configuration of the underlying fabric. To truly facilitate full application acceleration, it is often necessary to also include a highly efficient processor that integrates and collaborates with the accelerators while mai...
Journal title:
- Microprocessors and Microsystems - Embedded Hardware Design
Volume 36, Issue
Pages -
Publication date 2012